Semantic Segmentation of Video Collections using Boosted Random Fields
Authors
Abstract
Multimedia documentalists need effective tools to organize and search large video collections. Semantic video structuring consists of automatically extracting the inner structure of a video collection from the raw data. Extracted automatically, this high-level information would provide metadata that enables a new range of applications for browsing and searching video collections. In this paper, we present the feature extraction process, which provides a compact description of the audio, visual and text modalities. To reach the required semantic level, we then propose a contextual model that takes into account not only the link between features and labels but also the compatibility between labels associated with different modalities, which improves the consistency of the results. Boosted Random Fields are used to learn these relationships. They provide an iterative optimization framework for learning the model parameters and use boosting to reduce classification errors, avoid over-fitting and perform feature selection. We experiment on the TRECvid corpus and report results that validate the approach against existing studies.
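The abstract credits boosting with performing feature selection. A minimal sketch of that single idea follows, assuming discrete AdaBoost over one-feature threshold stumps. It is not the paper's Boosted Random Fields model: the label-compatibility terms across modalities are left out entirely, and every function name and the toy data are hypothetical.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Predict +1/-1 by thresholding a single feature."""
    return polarity if row[feature] >= threshold else -polarity


def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=5):
    """Discrete AdaBoost; each round adds one weighted single-feature stump."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


def selected_features(ensemble):
    """Indices of the features the ensemble actually uses."""
    return sorted({f for _, f, _, _ in ensemble})


# Hypothetical toy data: feature 0 is noise, feature 1 separates the classes.
X = [[0.3, 0.1], [0.9, 0.2], [0.5, 0.8], [0.1, 0.9], [0.7, 0.15], [0.2, 0.7]]
y = [-1, -1, 1, 1, -1, 1]
model = adaboost(X, y, rounds=3)
```

Because each stump reads exactly one feature, the set of features that appear in the trained ensemble acts as the selected subset; features that never reduce the weighted error are simply never picked.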
Similar resources
SIDF: A Novel Framework for Accurate Surgical Instrument Detection in Laparoscopic Video Frames
Background and Objectives: Identification of surgical instruments in laparoscopic video images has several biomedical applications. While several methods have been proposed for accurate detection of surgical instruments, the accuracy of these methods is still challenged by the high complexity of laparoscopic video images. This paper introduces a Surgical Instrument Detection Framework (SIDF) for a...
Improving Semantic Video Segmentation by Dynamic Scene Integration
Multi-class image segmentation and pixel-level labeling of the frames that make up a video could be made more efficient by incorporating temporal information. Recently, Convolutional Neural Networks (ConvNets) have made an impressive positive impact on the single image segmentation problem. In this paper, in order to further increase labeling accuracy, we propose a method for integrating short-...
Multi-class Video Objects Segmentation Based on Conditional Random Fields
Video object segmentation has been widely used in many fields. A conditional random fields (CRF) model is proposed to achieve accurate multi-class segmentation of video objects in complex environments. By using CRF, the color, texture, motion characteristics and neighborhood relations of objects are modeled to construct the corresponding energy functions in both the temporal and spatial doma...
Fast Bilateral Solver for Semantic Video Segmentation
We apply the fast bilateral solver technique to the problem of real-time semantic video segmentation. While structured prediction by a dense CRF is accurate on video datasets, the performance is not adequate for real-time segmentation. We hope to utilize the efficient smoothing methodology from the fast bilateral solver within the video segmentation framework introduced by Kundu et al. [9], imp...
Unsupervised Total Variation Loss for Semi-supervised Deep Learning of Semantic Segmentation
We introduce a novel unsupervised loss function for learning semantic segmentation with deep convolutional neural nets (ConvNet) when densely labeled training images are not available. More specifically, the proposed loss function penalizes the L1-norm of the gradient of the label probability vector image, i.e., the total variation, produced by the ConvNet. This can be seen as a regularization term...
Publication date: 2005